Free Energy Node Embedding via Generalized Skip-gram with Negative Sampling

Authors

Abstract

A widely established set of unsupervised node embedding methods can be interpreted as consisting of two distinctive steps: i) the definition of a similarity matrix based on the graph of interest, followed by ii) an explicit or implicit factorization of such a matrix. Inspired by this viewpoint, we propose improvements in both steps of the framework. On the one hand, we propose to encode node similarities via the free energy distance, which interpolates between the shortest path and commute time distances, thus providing an additional degree of flexibility. On the other hand, we propose a matrix factorization method based on a loss function that generalizes the skip-gram model with negative sampling to arbitrary similarity matrices. Compared with factorizations based on the widely used $\ell _{2}$ loss, the proposed method can better preserve node pairs associated with higher similarity scores. Moreover, it can be easily implemented using advanced automatic differentiation toolkits and computed efficiently by leveraging GPU resources. Node clustering, node classification, and link prediction experiments on real-world datasets demonstrate the effectiveness of incorporating free-energy-based similarities as well as the proposed matrix factorization, compared with state-of-the-art alternatives.
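To make the second contribution concrete, the following is a minimal NumPy sketch of an SGNS-style loss applied to an arbitrary similarity matrix: each entry S[i, j] weights a "positive" log-sigmoid term, while a negative-sampling term with rate k penalizes large inner products. The function name and the uniform negative term are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def generalized_sgns_loss(S, U, V, k=1.0):
    """SGNS-style loss for an arbitrary nonnegative similarity matrix S.

    U and V are the node and context embedding matrices. Unlike an
    l2 fit of U @ V.T to S, high-similarity pairs dominate the loss,
    so they are preserved more faithfully. Hypothetical sketch.
    """
    X = U @ V.T                      # pairwise inner products
    pos = S * np.log(sigmoid(X))     # similarity-weighted positive term
    neg = k * np.log(sigmoid(-X))    # negative-sampling penalty
    return -(pos + neg).sum()

# Small usage example on random data.
rng = np.random.default_rng(0)
S = rng.random((4, 4))               # toy similarity matrix
U = rng.normal(size=(4, 3))
V = rng.normal(size=(4, 3))
loss = generalized_sgns_loss(S, U, V)
```

Because the loss is built from differentiable primitives, the same expression can be written in an automatic differentiation toolkit (e.g., PyTorch) and minimized on a GPU, as the abstract notes.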


Similar articles

Incremental Skip-gram Model with Negative Sampling

This paper explores an incremental training strategy for the skip-gram model with negative sampling (SGNS) from both empirical and theoretical perspectives. Existing methods of neural word embeddings, including SGNS, are multi-pass algorithms and thus cannot perform incremental model update. To address this problem, we present a simple incremental extension of SGNS and provide a thorough theore...


Riemannian Optimization for Skip-Gram Negative Sampling

The Skip-Gram Negative Sampling (SGNS) word embedding model, well known through its implementation in the "word2vec" software, is usually optimized by stochastic gradient descent. However, the optimization of the SGNS objective can be viewed as a problem of searching for a good matrix under a low-rank constraint. The most standard way to solve this type of problem is to apply the Riemannian optimization framework...
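The low-rank matrix view mentioned above builds on Levy and Goldberg's observation that SGNS implicitly factorizes a shifted PMI matrix of word-context co-occurrences. A minimal NumPy sketch of constructing that target matrix (the function name is hypothetical):

```python
import numpy as np

def shifted_ppmi(C, k=5):
    """Shifted positive PMI matrix that SGNS implicitly factorizes.

    C is a word-context co-occurrence count matrix and k is the
    number of negative samples; entries are max(PMI - log k, 0).
    """
    total = C.sum()
    pw = C.sum(axis=1, keepdims=True) / total   # word marginals
    pc = C.sum(axis=0, keepdims=True) / total   # context marginals
    with np.errstate(divide="ignore"):
        pmi = np.log((C / total) / (pw * pc))   # log(0) -> -inf for zeros
    return np.maximum(pmi - np.log(k), 0.0)     # clip shifted PMI at zero

# Toy co-occurrence counts.
C = np.array([[2.0, 1.0],
              [0.0, 3.0]])
M = shifted_ppmi(C, k=1)
```

Searching for a rank-d factorization of this matrix is what motivates treating SGNS training as optimization over the manifold of low-rank matrices.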


The strange geometry of skip-gram with negative sampling

Despite their ubiquity, word embeddings trained with skip-gram negative sampling (SGNS) remain poorly understood. We find that vector positions are not simply determined by semantic similarity, but rather occupy a narrow cone, diametrically opposed to the context vectors. We show that this geometric concentration depends on the ratio of positive to negative examples, and that it is neither theo...


Revisiting Skip-Gram Negative Sampling Model with Regularization

We revisit skip-gram negative sampling (SGNS), a popular neural-network-based approach to learning distributed word representations. We first point out an ambiguity issue undermining the SGNS model, in the sense that the word vectors can be entirely distorted without changing the objective value. To resolve this issue, we rectify the SGNS model with quadratic regularization. A theoretical justifi...


Embedding Words as Distributions with a Bayesian Skip-gram Model

Distributed representations induced from large unlabeled text collections have had a large impact on many natural language processing (NLP) applications, providing an effective and simple way of dealing with data sparsity. Word embedding methods [1, 2, 3, 4] typically represent words as vectors in a low-dimensional space. In contrast, we encode them as probability densities. Intuitively, the de...



Journal

Journal title: IEEE Transactions on Knowledge and Data Engineering

Year: 2022

ISSN: 1558-2191, 1041-4347, 2326-3865

DOI: https://doi.org/10.1109/tkde.2022.3206175